A multi-point Metropolis scheme with generic weight 

functions 



Luca Martino, Victor Pascual Del Olmo, Jesse Read 

Department of Signal Theory and Communications, 
Universidad Carlos III de Madrid. 
Avenida de la Universidad 30, 28911, Leganes, Madrid, Spain. 
Iuca@tsc.uc3m.es, vpolmo@tsc.uc3m.es, jesse@tsc.uc3m.es 



Abstract 

The multi-point Metropolis algorithm is an advanced MCMC technique based 
on drawing several correlated samples at each step and choosing one of them 
according to some normalized weights. We propose a variation of this tech- 
nique where the weight functions are not specified, i.e., the analytic form 
can be chosen arbitrarily. This has the advantage of greater flexibility in 
the design of high-performance MCMC samplers. We prove that our method 
fulfills the balance condition, and provide a numerical simulation. We also 
give new insight into the functionality of different MCMC algorithms, and 
the connections between them. 

Keywords: Multiple Try Metropolis algorithm, Multi-point Metropolis 
algorithm, MCMC methods 



1. Introduction 



Monte Carlo statistical methods are powerful tools for numerical inference 
and stochastic optimization (see Robert and Casella (2004), for instance). 



Markov chain Monte Carlo (MCMC) methods are classical Monte Carlo 
techniques that generate samples from a target probability density function 
(pdf) by drawing from a simpler proposed pdf, usually to approximate an 



otherwise-incalculable (analytically) integral (Liu, 2004 Liang et al. , 2010) 



MCMC algorithms produce a Markov chain with a stationary distribution 
that coincides with the target pdf. 



The Metropolis-Hastings (MH) algorithm (Metropolis et al. , 1953 Hast 



ings, 1970) is the most famous MCMC technique. It can be applied to almost 
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any target distribution. In practice, however, finding a "good" proposal pdf 
can be difficult. In some applications, the Markov chain generated by the MH 
algorithm can remain trapped almost indefinitely in a local mode meaning 
that, in practice, convergence may not be reached. 

The Multiple-Try Metropolis (MTM) method of |Liu et al] ( |2000[ ) is an 
extension of the MH algorithm in which the next state of the chain is selected 
among a set of independent and identically distributed (i.i.d.) samples. This 
enables the MCMC sampler to make large step-size jumps without a lowering 
the acceptance rate; and thus MTM is can explore a larger portion of the 
sample space in fewer iterations. 

An interesting special case of the MTM, well-known in molecular simula- 
tion field, is the orientational bias Monte Carlo, as described in Chapter 13 of 



Frenkel and Smit (1996) and Chapter 5 of Liu (2004), where i.i.d. candidates 



are drawn from a symmetric proposal pdf, and one of these is chosen accord- 
ing to some weights directly proportional to the target pdf. Here, however, 
the analytic form of the weight functions is fixed and unalterable. 



Casarin et al. (2011) introduced a MTM scheme using different proposal 



pdfs. In this case the samples produced are independent but not identi- 
cally distributed. In Qin and Liu ( |2001 ), another generalization of the MTM 
(called the multi-point Metropolis method) is proposed using correlated can- 
didates at each step. Clearly, the proposal pdfs are also different in this 
case. 

Moreover, 



in 



Pandolfi et al. (2010) an extension of the classical MTM 



technique is introduced where the analytic form of the weights is not specified. 
In Pandolfi et al. (2010), the same proposal pdf is used to draw samples, so 
that the candidates generated each step of the algorithm are i.i.d. Further 
interesting and related considerations about the use of auxiliary variables 
for building acceptance probabilities within a MH approach can be found in 



Storvik (2011). 



In this paper, we draw from the two approaches (Qin and Liu, 2001 ) and 



(Pandolfi et al. 2010) to create a novel algorithm that selects a new state 
of the chain among correlated samples using generic weight functions, i.e., 
the analytic form of the weights can be chosen arbitrarily. Furthermore, we 
formulate the algorithm and the acceptance rule in order to fulfill the detailed 
balance condition. 

Our method allows more flexibility in the design of efficient MCMC sam- 
plers with a larger coverage and faster exploration of the sample space. In 
fact, we can choose any bounded and positive weight functions to either im- 
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prove performance or reduce computational complexity, independently of the 
chosen proposal pdf. Moreover, since in our approach the proposal pdfs are 
different, adaptive or interacting techniques can be applied, such as those 
introduced by Andrieu and Moulines (2006); Casarin et al. (2011). An im- 



portant advantage of our procedure is that, since in our procedure a new 
candidate is drawn from a conditional pdf which depends on the samples gen- 
erated earlier during the same time step, it constructs an improved proposal 
by automatically building on the information obtained from the generated 
samples. 

The rest of the paper is organized as follows. In Section [2] we recall the 
standard multi-point Metropolis algorithm. In Section [3] we introduce our 
novel scheme with generic weight functions and correlated samples. Section [4] 
provides a rigorous proof that the novel scheme satisfies the detailed balance 
condition. A numerical simulation is provided in Section [5] and finally, in 
Section [6j we discuss the advantages of our proposed technique and provide 
insight into the relationships among different MTM schemes in literature. 



2. Multi-point Metropolis algorithm 

In the classical MH algorithm, a new possible state is drawn from the 
proposal pdf and the movement is accepted with a suitable decision rule. In 
the multi-point approach, several correlated samples are generated and, from 
these, a "good" one is chosen. 

Specifically, consider a target pdf p (x) known up to a constant (hence, 
we can evaluate p(x) oc p (x)). Given a current state x G K (we assume 
scalar values only for simplicity in the treatment), we draw iV correlated 
samples each step from a sequence of different proposal pdfs {ftj}f=i, i.e., 

yt ~ tti(-|x),?/2 ~ ir 2 (-\x,yi),y 3 ~ ir 3 (-\x, y x , y 2 ), ... 
...y N ~ 7TN (.\x,y lj ...,y N _ 1 ). 

Therefore, we can write the joint distribution of the generated samples as 

Qn(V1, -,yN\x) = qN(yi:N\x) = TTl{yi\x)7T 2 (y2\x,yi) • ■■7T N (yN\x,yi:N~l), (2) 

i.e., 

N 

qN(yi-.N\x) =7ri(yi|x)]^7r i (2/j|x,yiy_i) (3) 

3=2 
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where, for brevity, we use the notation y^j = ...,%■] and = ...,yi] 
denotes the vector with the reverse order. 

A "good" candidate among the generated samples is chosen according to 
weight functions 

u j (z 1 ,...,z j+1 ) E R j+1 -»■ R + 

where zi, Zj + i, are generic variables and j = 1, N. The specific analytic 
form of the weights needed in this technique is 

U}j(z!, ...,Zj+i) = p(zi)'K 1 (z2\zi) ■ ■ ■7T N (z j \zi : j- 1 )X j (zi, ....,Zj+i), (4) 

where p(x) oc p (x) is the target pdf, Xj can be any bounded, positive, and 
sequentially symmetric function, i.e., 

\j(Zl, Z2:j+l) = \j(Zj+i:2,Zi). (5) 

Note that, since qj(z2-.j+i\zi) = 7Ti(z2\zi) ■ ■ • ^(^jl^iy-i) (see Eq. Q), we 
can rewrite the weight functions as 

u j (z 1 , z j+ i) = p(zi)q j (z 2 ;j+i\z 1 )\ j (z 1 , z 2:j+1 ). (6) 

2.1. Algorithm 

Given a current state x = Xt, the multi-point Metropolis algorithm con- 
sists of the following steps: 

1. Draw samples y\-N = [2/1,2/2, ■■■,Vn] from the joint pdf 

N 

qN(yi-.N\x) = ni( l yi\x)'Y[n j {y j \x,y 1: j- 1 ) 

i=2 

namely, draw yj from 7Tj(-\x,yx.j-i), with j = 1, ...,N. 

2. Calculate the weights U}j(yj : i,x) as in Eq. Q, and normalize them to 
obtain cjj, j = 1, N. 

3. Draw a y = yk G {j/i, j/jv} according to their weights Cui, ...,ojn- 

4. Set 

x* 1 = y k - 1 ,x* 2 = y k -2,...,x* k _ 1 = y 1 , (7) 
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and finally X u — X . Then, draw other "reference" samples 

x* rsJ iri(-\v,x*. j _ 1 ), (8) 
for j = k + 1, N. Note that for j = k + 1 we have 

Kj(-\y> x i:j-i) = *Mvi x i = yfc-ij-j^fc-i = vu x t = x )> 

and, for j = k + 2, N, we have 

Kj(-\y, x l:j-i) =*j{-\y, x i = yk-i,-,x* k _ 1 = yi,x* k = x ,x* k+1 . j _ 1 ). 

5. Compute Uj(x* a ,y) as in Eq. Q. 

6. Let a;j + i = y& with probability 



a = mm 



(9) 



E^iWj(^ :1 ,y)J ' 
otherwise set x t+ i = x with probability 1 — a. 
7. Set i = t + 1 and repeat from step 1. 
The kernel of this technique satisfies the detailed balance condition as shown 



m 



Qin and Liu (2001). However, to fulfill this condition, the algorithm needs 



that the weights are defined exactly with the form in Eq. Q. 

3. Extension with generic weight functions 

Now, we consider generic weight functions Uj(zi, ...,Zj + i) G W +1 — > IR + , 
that have to be (a) bounded and (b) positive. In this case, the algorithm can 
be described as follows. 

1. Draw iV samples yi.N = [2/1,2/2, •••,2/iv] from the joint pdf 

N 

qN{y\-.N\ x ) = ^i(yi\ x )Yl^j(yj\ x ,yi-.j-i) 

namely, draw yj from Hj(-\x, j/i y -_i), with j = 1, N. 
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2. Choose some suitable (bounded and positive) weight functions. Then, 
calculate each weight u>j(yj : i,x), and normalize them to obtain Cuj, j = 



1,...,N. 



3. Draw a y = y k G {yi, ....,yN} according to oui, ...,u>n, and set W y = Uk, 
i.e., 

Wk(yk:l,x) 



w y ± 



Ell W i(%':l^) 



4. Set 

x \ = Vk-li x 2 = Vk-2, • • • j x k-l = Vli 

and finally x* k = x. Then, draw the remaining "reference" samples 



x* ~ TTji-ly^l -.J, 



(10) 

(11) 
(12) 



for j = k + 1, JV. 



5. Compute the general weights Uj(x* :1 ,y) and calculate the normalized 
weight 



6. Set x t+ i = yk with probability 

p(y)7riK|y)7T2(4|y, s|) • • • TT k (x* k \y, x\, x* k _ x ) W x 



a = min 1, 



. ' P(x)7r 1 (y 1 \x)ir 2 (y2\x,yi)---irk{yk\x,yi,...,yk-i) W y 
We can rewrite it in a more compact form as 



a = mm 



\ p{y)qk{x\, k \y) w x 

p{x)q k {yi-.k\x) Wy 



(13) 



(14) 



(15) 



where we recall 



qk{yi-.k\x) = 7Ti(yi|x) Y[^j(yj\x,yi-.j-i)- (16) 

i=2 

where k is the index of the chosen sample y k . 
Otherwise, set x t+1 = x with probability 1 — a. 

7. Set t — t + 1 and repeat from step 1. 

We emphasise that in the algorithm above we have not specifically defined 
the weight functions. 
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3.1. Examples of weight functions 

The weight functions must to be bounded and positive. The choice can 
depend on some criteria such as improving performance or reducing compu- 
tational complexity. If the target density is bounded, two possibilities are 



=p(zi), 



(17) 



or 



w&i, Zj+i) = p{z l )p(z 2 ) ■ ••p(z j+1 ), 
with j = 1, N. Another possible choices are the following 



UJj{Zi, Zj + i) 



_qj{Zl:j\Z j + 1 ) _ 



(18) 



(19) 



where 9 > is a positive constant, or 

p(Zj) p{Zj-x) 



U)AZ\, Zj + i) 



P{zi) 



qi{Zj\Zj +1 ) q2{Zj-i : j\Zj + i 



q j (z 1:j \z j+1 ) 



(20) 



and a third possible choice 



UJj(Zi, Zj +1 ) 



Kj+l(Zl\Zj+l;2) 



(21) 



where 7Tj + i(zi|zj + i :2 ) is the j + 1-th proposal pdf used in the step 1 of the 
algorithm. It is important to remark that the z-variables are ordered such 
that Z\ is the most recently generated sample, Zj is the first drawn sample, 
and Zj + i represents the previous step of the chain. 

Clearly, owing to the great flexibility in the construction of the weight 
functions, it can be difficult to assert which is the best choice in terms of 
performances of the algorithm. However, evidently, in general including more 
statistical information in the weights can improve performance yet, at the 
same time, increases the computational cost of the designed technique. 

More specific theoretical or empirical studies are needed to clear up this 
issue. Indeed, observe that the point of the best selection of the weights is 



Pandolfi et al. (2010), for instance 



even unclear in the classical MTM by Liu et al. (2000), as for the method in 
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3.2. Relationship with the independent multiple tries scheme 



In Pandolfi et al. (2010) i.i.d. candidates are proposed at each time step. 



The acceptance probability a in Eqs. (14)-(15) may appear similar to the 



acceptance probability in Pandolfi et al. (2010). However, note that the ex- 



pression of a in Eq. (15) is different to the acceptance probability in Pandolfi 



et al. (2010) for two main reasons: 

p(y)9fcKfc|y) is distinct (see Eqs. (Il6b), and 



fa) the first factor 



(b) the definition and computation of W x and W y (see Eqs. (10) and (13)) 



are also different since here the weight functions take in account the 
previous generated samples (in the same time step). 

If here we set iTj(yj\x, g/iy-i) = 7r(yj\x) for all j = 1, N, then the steps 



of our algorithm coincides exactly with those of technique in |Pandolfi et al. 
(2010) except for the step 4. Indeed, the way of choosing the "reference" 



points are different in the two methods (in our case, some of them are fixed 



while in Pandolfi et al. (2010) all the reference point are chosen random). 



We can find a specular difference between the methods in Liu et al. (2000) 



and Qin and Liu (2001). 



3.3. Multi-point Metropolis as specific case 

In the case when the weight functions are chosen as in Eq. ([6]), i.e., 

u k (zi, z j+ i) = p(z 1 )q j (z 2 :j + i\zi)\ j (z 1 , z j+ i), where 

Xj{Zl, Z2:j+l) = A_j(Zj + i:2, Z\), (22) 

is sequentially symmetric, then our scheme coincides exactly with the stan- 



dard multi-point Metropolis method in Qin and Liu (2001). Indeed, first of 



all we can rewrite the expression (15) as 



a 



mm 



1 P(y)qk(x* 1:k \y) uJkjx^y) Y./, i ^(//,: I--'') 
' p{x)q k (yi:k\x) u) k (y ka ,x) Y^^A^y) 



(23) 



Then, recalling the Eq. (11), i.e., x\ = y k -\,x% = y k -2 



yi, x l 



X 



and y = y k , the two weights u k (x* k . 1 ,y) and u k (y k: i,x) can be expressed 
exactly as 



Uk(x%i,y) = u k {x* k = x,x* k ^ = yi, ...,x$ 
= p{x)q k {yi :k \x)X k {x, y 1:k ), 



Vk-i,y = Vk) 
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and 



Wfc(j/ fcs i, x) = u k (y k = y, y k ^i = x\, ...,yi = 4-u x = x t) 
= p(y)Qk(x* 1:k \y)Xk(y,xl. k ), 

respectively. Therefore replacing the weights C0k(x k:1 ,y) and u k (y k: i,x) in 
Eq. (23 ), we obtain 



mm 



mm 



that coincides with acceptance probability in Eq. (|9| of the standard multi- 
point Metropolis algorithm. Note that we have considered X k (x,yi :k ) — 
X k {y,x\. k ). Indeed, since x* k = x we can write X k (x, yi :k ) = Xk(y,x*. k _ v x), 
then because = y k -i-.i, we obtain X k (x, y\. k ) = X k (y,y k -i : i,x), and as 

y = y k , finally we have 



X k {x,yi-k) = X k (y k:1 ,x), 



that is exactly the condition assumed in Eq. (22). In the following, we show 
the proposed technique satisfies the detailed balance condition. 



4. Proof of the detailed balance condition 

To guarantee that a Markov chain generated by an MCMC method con- 
verges to the target distribution p(x) oc p (x), the kernel A(y\x) of the cor- 
responding algorithm fulfills the following detailed balance condition^ 

p(x)A(y\x) = p(y)A(x\y). 

First of all, we have to find the kernel A(y\x) of the algorithm, i.e., the 
conditional probability to move from x to y. For simplicity, we consider the 
case x 7^ y (case x = y is trivial). The kernel can be expressed as 

N 

A(y = y k \x) = ^h(y = y k \x,k = j), (24) 

3=1 



1 Note that the detailed balance condition is sufficient but not necessary condition. 
Namely, the detailed balance ensures invariance. The converse is not true. Markov chains 
that satisfy the detailed balance condition are called reversible. 
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where h(y = y k \x,k = j) is the probability of accepting x t+1 = y k given 
Xt = x when the chosen sample y k is the j-th candidate, i.e., when y k = yj. 

In the sequel, we study just one h(y = y k \x, k) for a generic k G {1, N}. 
Indeed, if h(y = y k \x, k) fulfills the detailed balance condition (it is symmetric 
w.r.t. x and y), then A(y\x) also satisfies the detailed balance because it is a 
sum of symmetric functions. Therefore, we want to show that 

p(x)h(y\x, k) = p(y)h(x\y, k), 

for a generic k G {1, ...,N}. Following the steps above of the algorithm, we 
can write 



p(x)h(y\x,k) = 



p(x) 



N 



Y[n j (y j \x,y 1:j - 1 ) 



j'=i 
min 



U k (yk:l,x) 



X) 



N 



II *i( X *\y> X l:i-l) 
;=fc+i 



\ P(y)Qk{x* 1:k \y) w x 
' P(x)q k (yi:k\x) W y 



dyi:k-idyk+i-.Ndx* k+1 . N . 



Note that each factor inside the integral corresponds to a step of the method 
described in the previous section. The integral is over all auxiliary vari- 
ables. Recalling the definition of the joint probability q k {yi±\x) and W y , the 
expression can be simplified to 



p(x)h(y\x, k) = 
= p(x) J •■■J 1k(yi-. 



k\x) 



N 



II ^-(i/jk.yiy-i) 
j=k+l 



Wy 



N 



11 lr i( x *\y> x l:i-l) 
i=k+l 



mm 



1 p(y)Qk(x* 1:k \y) 



' p(x)q k (y 1:k \x) Wy 
and we only arrange it, obtaining 
p(x)h(y\x, k) = 



dy v . k -idy k+1:N dx* k+v 



N j 



J ■ ■ J p(x)q k (yi:k\x)W y 



N 



n ^(fil^J/iy-i) 
j=fc+i 



AT 



i=fe+l 



mm 



p(a;)%(yi:fek) w y 



dy 1:k -idy k+1 . N dx* k+1 . N . 
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Now, we multiply the two members of the function min[-, •] by the factor 
p{x)qk(yi:k\x)W y so that 



p(x)h(y\x, k) = 




min [p{x)q k {yi,k\x)W y ,p(y)q k (xl. k \y)W x \ dyi lk -idyk+i.. N dx* h+liN . 

Therefore, it is straightforward that the expression above is symmetric in x 
and y. Indeed, we can exchange the notations of x and y, and x* and yj, 
respectively, and the expression does not vary Then we can write 

p(x)h(y\x, k) = p(y)h(x\y, k). (25) 

We can repeat the same development for each k obtaining 

p(x)A(y\x) = p(y)A(x\y), (26) 

that is the detailed balance condition. Therefore, the generated Markov chain 
converges to our target pdf. 



5. Toy example 



Now we provide a simple numerical simulation to show an example of 
multi-point scheme with generic weight functions and compare it with the 
technique in Pandolfi et al. (2010). Let X e R be a random variable^] with 
bimodal pdf 

p (x) oc p(x) = exp {-(x 2 - 4) 2 /4} . (27) 
Our goal is to draw samples from p a (x) using our proposed multi-point tech- 



nique. 



We consider a Gaussian densities as proposal pdfs (a standard choice) 



Kj(yj\xt,yi:j-i) « exp 



2a 2 



where we use a 2 = 1 and 



7i 
i - 1 



yXt + Vi + •■■ + Vi-2) + 722/i-i, 



(28) 



(29) 



2 Note that we consider a scalar variable only to simplify the treatment. Clearly, all the 
considerations and algorithms are valid for multi-dimensional variables. 
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i.e, ju is a weighted mean (ji + 72 = 1) of the previous state x t and the 
previous generated samples (at the same time step). Specifically, we set 
71 = 0.2 and 72 = 0.8. 

Moreover, we choose very simple weight functions depending only on first 
variable and on the target pdf 

uf\z 1 ,z 2 ,....,z J+1 ) = [p(z 1 )] e , (30) 

with 9 = 1/2. Note that p(-) is bounded and also positive (since it is a pdf). 



This kind of weights cannot be used in the multi-point scheme of Qin and 



Liu (2001), expect for 9 = 1 and using a specific sequence of the proposal 
pdfs. Moreover, for 9 = 1 this weight function can be also used in a standard 
MTM of Liu et aL] ( |2000 ) if the chosen proposal density ii(y\x) is symmetric 
(i.e, ir(y\x) = n(x\y) and choosing X(x,y) = ^y)- 

We also compare the performances of the proposed algorithms with the 
weights as 



00 



(2), 



and 



00 



zi, zj+i) = P(zi)p(z 2 ) ■ ■■p(z j+ i) 

P(zi) 



(3) 



(zi, 



(31) 



(32) 



^j+l(Zl\Zj+l;2) 

Then, we run the proposed multi-point algorithm with different numbers N 
of candidates and calculate the estimated acceptance rate (the averaged prob- 
ability of accepting a movement) and linear correlation coefficient (between 
one state of the chain and the next). We also run the method in Pandolfi 

(Vj-xt) 2 



et al. (2010) with proposal pdf 7i(yj\x t ) oc exp |— 2o . 2 j and compare the 

performances, using weight functions as in Eq. (30) and third type in Eq. 

(32). Because the samples are generated independently, we do not compare 

using weights in Eq. (31), as statistically this no longer makes sense. 

Moreover, observe that in the scheme of Pandolfi et al. (2010) (where the 

candidates are drawn independently), the weight functions in Eq. (32) be 

come ^ 3 \ yj ,x t -i) = 

Note also that this particular choice of weights tu^ can be used in the stan 

1 



where Xt-i is the previous step of the chairj 3 



dard MTM of 



Liu et al. 



and, in this 



|(|2000[) (by ch oosing X(x,y) = ^ y[xMx[y)j 
case, the technique of Pandolfi et al. ( 2010[ ) coincides with a standard MTM. 



3 Note that, in the expression of the weights oj^ 3 \yj,Xt-i), we remove the subscript j 
because in Pandolfi et al. (20101 the analytic form of the weights is the same for each 
generated sample yj, j = 1,...,N. 
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Figure[]Ja) depicts the target density p (x) (solid line) and the normalized 
histogram of 100, 000 samples drawn using our proposed scheme and N — 10. 
Figures [T](b)-(c) illustrate the mean acceptance probability and the estimated 
correlation coefficient (for different values of N and averaged using 5, 000 
runs) of the two techniques and different choice of weights: our method is 



shown with squares using , with solid l ine using uj^' an d with circles using 



(2) 



(3) 



. The performances of the method in 



Pandolfi et al. 



with dashed line corresponding to the first choice u;^ and dotted line with 
triangles for uf\ 

We can see although the proposed technique always attains smaller ac- 
ceptance rates, the resulting correlations are always smaller than the corre- 



(2010) are depicted 



(2) 

lations obtained by the other method, except using weights ujj in Eq. (31 ). 
Moreover, the best results are obtained with the proposed technique using 



(3) 

the weights in Eq. (32). In this case, the correlation decreases when N 
increases, up to 0.72 withiV = 100. 




(a) 



(b) 



(c) 



Figure 1: (a) The target density p {x) (solid line) and the normalized histogram of the 
samples generated using the proposed scheme and with N = 10. (b) The mean acceptance 
probability of jumping in a new state, dependin g on the number of tries N . We show the 
results of the technique in 



Pandolfi ct al. 



(20101 (dashed line for weights and dotted 



line with triangles for wj 3 ' 1 ) and our method (squares for Uj, solid line with and with 

circles with oj - ). (c) Estimated linear correlation coefficient depending on the number 
of tries N for the different techniques. 



6. Discussion 

In this work, we have introduced a Metropolis scheme with multiple cor- 
related points where the weight functions are not defined specifically, i.e., the 
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analytic form can be chosen arbitrarily. We proved that our novel scheme 
satisfies the detailed balance condition. 



Our approach draws from two different approaches (Pandolfi et al. 2010 



Qin and Liu, 2001) to form a novel efficient and flexible multi-point scheme. 



The multi-point approach with correlated samples provides different ad- 
vantages over the standard MTM. For instance, the multi-point procedure 
can iteratively improve the proposal pdfs in two different ways. Firstly, since 
the proposal pdfs can be distinct, as in Casarin et al. (2011), it is possible 
to tune the parameters of each proposal in every time step. Secondly, since 
the candidates are generated sequentially, successive proposal pdfs can be 
improved learning from the previously produced samples during the same 
time step. 

Moreover, in our technique, the only constraints of the weight functions 
are that they must be bounded and positive, unlike in the existing multi-point 
Metropolis algorithm (Qin and Liu, 2001) which is based on a specific defi- 
nition of the weight functions. Here the weights can be chosen with respect 
to some criteria such as improving performance or reducing computational 
complexity. Thus our method avoids any control or check the existence of 
a suitable function A and, therefore, the selection of the weight functions is 
broader and easier. 

It is interesting to observe that, in general, the function A may depend 
on the proposal pdf for a specific choice of weights and, in some cases, may 
entail certain constraints on the proposal pdf (such as that it be symmetric, 
for instance). An important consequence of this, it is that the weights can 
be chosen independently of the specific proposal pdf used in the algorithm. 
Namely, the proposal distribution and the weight functions can be selected 
separately, to fit well to the specific problem and to improve the performance 
of the technique. However, further theoretical or empirical studies are needed 
to determine the best choice of weight functions given a certain proposal and 
target density. 

Furthermore, unlike in Pandolfi et al. (2010), in our method the weights 
can depend on the previous candidates, and the dimension of the weight 
functions grows from M 2 to M. N , thus being more general and potentially 
more powerful. Figure [2] illustrates the relationships among different MTM 
schemes according to the flexibility in the choice of the proposal and weight 
functions. Finally, we have also shown a numerical simulation and a simple 
multi point scheme that provides good performances reducing the correlation 
in the produced chain. 
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Figure 2: Comparison of different MTM schemes in literature according to the flexibility 
in the choice of the proposal and weight functions. With the acronym OBMC we indicate 
the orientational bias Monte Carlo introduced by Frenkel and Smit (1996). 
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